{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "28be220e",
   "metadata": {},
   "source": [
    "# Time Series Modeling with Amazon Forecast and DeepAR on SageMaker - Amazon Forecast"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. \n",
    "\n",
    "![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65d44706",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "Amazon offers customers a multitude of time series prediction services, including **DeepAR on SageMaker** and the fully managed service **Amazon Forecast**. Both services are similar in some aspects, yet differ in others. This notebook series aims to highlight the similarities and differences between both services by demonstrating how each service is used as well as describing the features each service offers. As a result, both notebooks in the series will use the same dataset. We will consider a real use case using the [Beijing Multi-Site Air-Quality Data Set](https://archive.ics.uci.edu/ml/datasets/Beijing+Multi-Site+Air-Quality+Data) which features hourly air pollutants data from 12 air-quality monitoring sites from March 1st, 2013 to February 28th, 2017, and is featured in the [[1](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5627385/)] academic paper. This particular notebook will focus on **Amazon Forecast**, and will:\n",
    "- Explain Amazon Forecast model options\n",
    "- Demonstrate how to train an AutoPredictor\n",
    "- Create inferences from Amazon Forecast model\n",
    "\n",
    "One feature of **Amazon Forecast** is that the service can be used without any code. However, this notebook will outline how to use the service within a notebook format. Before you start, please note that training an **Amazon Forecast** may take several hours; this particular notebook took approximately `6 hours 30 minutes` to complete. Also, make sure that your SageMaker Execution Role has the following policies:\n",
    "\n",
    "- `AmazonForecastFullAccess`\n",
    "- `AmazonSageMakerFullAccess`\n",
    "- `IAMFullAccess`\n",
    "\n",
    "\n",
    "For convenience, here is an overview of the structure of this notebook:\n",
    "1. [Introduction](#Introduction)\n",
    " - [Preparation](#Preparation)\n",
    "2. [Data Preprocessing](#Data-Preprocessing)\n",
    " - [Data Import](#Data-Import)\n",
    " - [Data Visualization](#Data-Visualization)\n",
    " - [Train/Test Split](#Train/Test-Split)\n",
    " - [Target/Related Time Series Split](#Target/Related-Time-Series-Split)\n",
    " - [Upload to S3](#Upload-to-S3)\n",
    "3. [Dataset Group](#Dataset-Group)\n",
    "4. [Datasets](#Datasets)\n",
    " - [Create Datasets](#Create-Datasets)\n",
    " - [Dataset Import](#Dataset-Import)\n",
    "5. [Predictors](#Predictors)\n",
    " - [Predictor Options](#Predictor-Options)\n",
    " - [AutoPredictor](#AutoPredictor)\n",
    "6. [Forecast](#Forecast)\n",
    " - [Create Forecast](#Create-Forecast)\n",
    " - [Query Forecast](#Query-Forecast)\n",
    "7. [Resource Cleanup](#Resource-Cleanup)\n",
    "8. [Next Steps](#Next-Steps)\n",
    "\n",
    "### Preparation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33ca593e",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install seaborn --upgrade"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "360bbe83",
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "import os\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import json\n",
    "import time\n",
    "import sagemaker\n",
    "from datetime import datetime\n",
    "from IPython.display import display\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "# import forecast notebook utility library\n",
    "import util"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be2a5ad0",
   "metadata": {},
   "source": [
    "To use Amazon Forecast, we're going to need to define an IAM role with the `AmazonForecastFullAccess` policy attached, as well as `AmazonS3FullAccess`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dc474017",
   "metadata": {},
   "outputs": [],
   "source": [
    "role_name = \"DemoForecastRole\"\n",
    "role_arn = util.get_or_create_iam_role(role_name=role_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9b8fca8",
   "metadata": {},
   "outputs": [],
   "source": [
    "session = boto3.Session()\n",
    "s3_client = session.client(\"s3\")\n",
    "forecast_client = session.client(\"forecast\")\n",
    "forecast_query_client = session.client(\"forecastquery\")\n",
    "region = session.region_name"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e612f5be",
   "metadata": {},
   "source": [
    "All paths and resource names are defined below for a simple overview for where each resource will be located:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11e7ffea",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Remove paths if notebook was run before\n",
    "!rm -r data\n",
    "!rm -r forecast"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11d58a53",
   "metadata": {},
   "outputs": [],
   "source": [
    "bucket = sagemaker.Session().default_bucket()\n",
    "sagemaker_sample_bucket = \"sagemaker-sample-files\"\n",
    "version = datetime.now().strftime(\"_%Y_%m_%d_%H_%M_%S\")\n",
    "\n",
    "dirs = [\"data\", \"forecast\", \"forecast/to_export\"]\n",
    "for dir_name in dirs:\n",
    "    os.makedirs(dir_name)\n",
    "\n",
    "dataset_s3_path = \"datasets/timeseries/beijing_air_quality/PRSA2017_Data_20130301-20170228.zip\"\n",
    "dataset_save_path = \"data/dataset.zip\"  # path where the zipped dataset is imported to\n",
    "dataset_path = \"data/dataset\"  # path where unzipped dataset is located\n",
    "tts_path = \"forecast/to_export/tts.csv\"\n",
    "rts_path = \"forecast/to_export/rts.csv\"\n",
    "tts_s3_path = \"demo-forecast/tts.csv\"\n",
    "rts_s3_path = \"demo-forecast/rts.csv\"\n",
    "dataset_group_name = \"demo_forecast_dsg_{}\".format(version)\n",
    "dataset_tts_name = \"demo_forecast_tts_{}\".format(version)\n",
    "dataset_rts_name = \"demo_forecast_rts_{}\".format(version)\n",
    "dataset_tts_import_name = \"demo_forecast_tts_import_{}\".format(version)\n",
    "dataset_rts_import_name = \"demo_forecast_rts_import_{}\".format(version)\n",
    "auto_predictor_name = \"demo_forecast_auto_predictor_{}\".format(version)\n",
    "forecast_name = \"demo_forecast_forecast_{}\".format(version)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1d6b6f3c",
   "metadata": {},
   "source": [
    "## Data Preprocessing\n",
    "This section prepares the dataset for use in **Amazon Forecast**. It will cover:\n",
    "- Target/Test dataset splitting\n",
    "- Target/Related time series splitting\n",
    "- S3 uploading"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38a8dd64",
   "metadata": {},
   "source": [
    "### Data Import\n",
    "\n",
    "This section will be demonstrating how to import data from an S3 bucket, but one can import their data whichever way is convenient. The data for this example will be imported from the `sagemaker-sample-files` **S3 Bucket**. \n",
    "\n",
    "\n",
    "To communicate with S3 outside of our console, we'll use the **Boto3** python3 library. More functionality between **Boto3** and **S3** can be found here: [Boto3 Amazon S3 Examples](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-examples.html)\n",
    "\n",
    "This particular dataset decompresses into a single folder named `PRSA_Data_20130301-20170228`. It contains 12 `csv` files, each containing air quality data for a single location. Each DataFrame will contain the following columns:\n",
    "- No: row number\n",
    "- year: year of data in this row\n",
    "- month: month of data in this row\n",
    "- day: day of data in this row\n",
    "- hour: hour of data in this row\n",
    "- PM2.5: PM2.5 concentration (ug/m^3)\n",
    "- PM10: PM10 concentration (ug/m^3)\n",
    "- SO2: SO2 concentration (ug/m^3)\n",
    "- NO2: NO2 concentration (ug/m^3)\n",
    "- CO: CO concentration (ug/m^3)\n",
    "- O3: O3 concentration (ug/m^3)\n",
    "- TEMP: temperature (degree Celsius)\n",
    "- PRES: pressure (hPa)\n",
    "- DEWP: dew point temperature (degree Celsius)\n",
    "- RAIN: precipitation (mm)\n",
    "- wd: wind direction\n",
    "- WSPM: wind speed (m/s)\n",
    "- station: name of the air-quality monitoring site\n",
    "\n",
    "#### Citations\n",
    "- Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28e97a54",
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_client.download_file(sagemaker_sample_bucket, dataset_s3_path, dataset_save_path)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad24b93d",
   "metadata": {},
   "outputs": [],
   "source": [
    "!unzip data/dataset.zip -d data && mv data/PRSA_Data_20130301-20170228 data/dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "71871bb7",
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = [\n",
    "    pd.read_csv(\"{}/{}\".format(dataset_path, file_name)) for file_name in os.listdir(dataset_path)\n",
    "]\n",
    "\n",
    "display(dataset[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "584acfef",
   "metadata": {},
   "outputs": [],
   "source": [
    "for df in dataset:\n",
    "    df.insert(0, \"datetime\", pd.to_datetime(df[[\"year\", \"month\", \"day\", \"hour\"]]))\n",
    "    df.drop(columns=[\"No\", \"year\", \"month\", \"day\", \"hour\"], inplace=True)\n",
    "\n",
    "display(dataset[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d81d4d34",
   "metadata": {},
   "source": [
    "### Data Visualization\n",
    "\n",
    "For this example, we'll use the temperature, or `TEMP` column, as our target variable to predict on. Let's first take a look at what each of our time series looks like."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8f4d85b",
   "metadata": {},
   "outputs": [],
   "source": [
    "sns.set_style(\"dark\")\n",
    "fig, axes = plt.subplots(2, 3, figsize=(18, 10))\n",
    "fig.suptitle(\"Target Values\")\n",
    "\n",
    "for i, axis in zip(range(len(dataset))[:6], axes.ravel()):\n",
    "    sns.lineplot(data=dataset[i], x=\"datetime\", y=\"TEMP\", ax=axis)\n",
    "    axis.set_title(dataset[i][\"station\"].iloc[0])\n",
    "    axis.set_ylabel(\"Temperature (Celsius)\")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5c036d52",
   "metadata": {},
   "source": [
    "![Dataset Visual](./images/dataset_visual.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b02716f6",
   "metadata": {},
   "source": [
    "### Train/Test Split\n",
    "Let's set the prediction horizon to the last 2 weeks of our time series. As a result, our test set should be the last `336` instances of each time series, as the frequency of our data is hourly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1be40e63",
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction_length = 14 * 24\n",
    "\n",
    "df_train = pd.concat([ts[:-prediction_length] for ts in dataset])\n",
    "\n",
    "df_test = pd.concat([ts.tail(prediction_length) for ts in dataset])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3a61dd5",
   "metadata": {},
   "source": [
    "### Target/Related Time Series Split\n",
    "**Amazon Forecast** allows the use of a related time series, which contains other features that may increase the accuracy of our predictor. For simplicity, we'll use the PRES(pressure), RAIN(precipitation), and WSPM(wind speed) features for our related time series. More information can be found here: [Related Time Series Datasets](https://docs.aws.amazon.com/forecast/latest/dg/related-time-series-datasets.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e8c5bc07",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_tts = df_train[[\"station\", \"datetime\", \"TEMP\"]]\n",
    "df_rts = df_train[[\"station\", \"datetime\", \"PRES\", \"RAIN\", \"WSPM\"]]\n",
    "\n",
    "df_tts_test = df_test[[\"station\", \"datetime\", \"TEMP\"]]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b60855dc",
   "metadata": {},
   "source": [
    "### Upload to S3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a212708",
   "metadata": {},
   "outputs": [],
   "source": [
    "df_tts.to_csv(tts_path, header=False, index=False)\n",
    "df_rts.to_csv(rts_path, header=False, index=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9f3a55f",
   "metadata": {},
   "outputs": [],
   "source": [
    "s3_client.upload_file(tts_path, bucket, tts_s3_path)\n",
    "s3_client.upload_file(rts_path, bucket, rts_s3_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51b751a2",
   "metadata": {},
   "source": [
    "## Dataset Group\n",
    "A dataset group is a container for all of our resources pertaining to one particular dataset. This includes target and related time series, predictors, and forecasts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3a8a1a5",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = forecast_client.create_dataset_group(\n",
    "    DatasetGroupName=dataset_group_name, Domain=\"CUSTOM\"\n",
    ")\n",
    "\n",
    "dataset_group_arn = response[\"DatasetGroupArn\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2d75218b",
   "metadata": {},
   "source": [
    "## Datasets\n",
    "This section will go over:\n",
    "- Creating datasets\n",
    "- Creating schema\n",
    "- Importing datasets into dataset groups\n",
    "\n",
    "### Create Datasets\n",
    "When importing datasets into **Amazon Forecast**, a schema for the target and/or related time series must be defined. This is a dictionary that describes the "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c34486d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "display(df_tts)\n",
    "display(df_rts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "df16a727",
   "metadata": {},
   "outputs": [],
   "source": [
    "DATASET_FREQUENCY = \"H\"\n",
    "TIMESTAMP_FORMAT = \"yyyy-MM-dd hh:mm:ss\"\n",
    "\n",
    "tts_schema = {\n",
    "    \"Attributes\": [\n",
    "        {\"AttributeName\": \"item_id\", \"AttributeType\": \"string\"},\n",
    "        {\"AttributeName\": \"timestamp\", \"AttributeType\": \"timestamp\"},\n",
    "        {\"AttributeName\": \"target_value\", \"AttributeType\": \"float\"},\n",
    "    ]\n",
    "}\n",
    "\n",
    "response = forecast_client.create_dataset(\n",
    "    Domain=\"CUSTOM\",\n",
    "    DatasetType=\"TARGET_TIME_SERIES\",\n",
    "    DatasetName=dataset_tts_name,\n",
    "    DataFrequency=DATASET_FREQUENCY,\n",
    "    Schema=tts_schema,\n",
    ")\n",
    "\n",
    "tts_dataset_arn = response[\"DatasetArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a659337",
   "metadata": {},
   "outputs": [],
   "source": [
    "rts_schema = {\n",
    "    \"Attributes\": [\n",
    "        {\"AttributeName\": \"item_id\", \"AttributeType\": \"string\"},\n",
    "        {\"AttributeName\": \"timestamp\", \"AttributeType\": \"timestamp\"},\n",
    "        {\"AttributeName\": \"PRES\", \"AttributeType\": \"float\"},\n",
    "        {\"AttributeName\": \"RAIN\", \"AttributeType\": \"float\"},\n",
    "        {\"AttributeName\": \"WSPM\", \"AttributeType\": \"float\"},\n",
    "    ]\n",
    "}\n",
    "\n",
    "response = forecast_client.create_dataset(\n",
    "    Domain=\"CUSTOM\",\n",
    "    DatasetType=\"RELATED_TIME_SERIES\",\n",
    "    DatasetName=dataset_rts_name,\n",
    "    DataFrequency=DATASET_FREQUENCY,\n",
    "    Schema=rts_schema,\n",
    ")\n",
    "\n",
    "rts_dataset_arn = response[\"DatasetArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e32cf170",
   "metadata": {},
   "outputs": [],
   "source": [
    "forecast_client.update_dataset_group(\n",
    "    DatasetGroupArn=dataset_group_arn,\n",
    "    DatasetArns=[\n",
    "        tts_dataset_arn,\n",
    "        rts_dataset_arn,\n",
    "    ],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71c042a2",
   "metadata": {},
   "source": [
    "### Dataset Import "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "99c4646b",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = forecast_client.create_dataset_import_job(\n",
    "    DatasetImportJobName=dataset_tts_import_name,\n",
    "    DatasetArn=tts_dataset_arn,\n",
    "    DataSource={\n",
    "        \"S3Config\": {\"Path\": \"s3://{}/{}\".format(bucket, tts_s3_path), \"RoleArn\": role_arn}\n",
    "    },\n",
    "    TimestampFormat=TIMESTAMP_FORMAT,\n",
    ")\n",
    "\n",
    "tts_dataset_import_job_arn = response[\"DatasetImportJobArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a6ebcc4",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = forecast_client.create_dataset_import_job(\n",
    "    DatasetImportJobName=dataset_tts_import_name,\n",
    "    DatasetArn=rts_dataset_arn,\n",
    "    DataSource={\n",
    "        \"S3Config\": {\"Path\": \"s3://{}/{}\".format(bucket, rts_s3_path), \"RoleArn\": role_arn}\n",
    "    },\n",
    "    TimestampFormat=TIMESTAMP_FORMAT,\n",
    ")\n",
    "\n",
    "rts_dataset_import_job_arn = response[\"DatasetImportJobArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5bdc857",
   "metadata": {},
   "outputs": [],
   "source": [
    "while True:\n",
    "    tts_status = forecast_client.describe_dataset_import_job(\n",
    "        DatasetImportJobArn=tts_dataset_import_job_arn\n",
    "    )[\"Status\"]\n",
    "    rts_status = forecast_client.describe_dataset_import_job(\n",
    "        DatasetImportJobArn=rts_dataset_import_job_arn\n",
    "    )[\"Status\"]\n",
    "    if tts_status == \"ACTIVE\" and rts_status == \"ACTIVE\":\n",
    "        break\n",
    "    if tts_status == \"CREATE_FAILED\" or rts_status == \"CREATE_FAILED\":\n",
    "        print(\"Dataset Import Job Failed\")\n",
    "        break\n",
    "    time.sleep(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eda24161",
   "metadata": {},
   "source": [
    "## Predictors\n",
    "### Predictor Options\n",
    "**Amazon Forecast** offers six built-in algorithms:\n",
    "1. CNN-QR\n",
    "2. DeepAR+\n",
    "3. Prophet\n",
    "4. NPTS\n",
    "5. ARIMA\n",
    "6. ETS\n",
    "\n",
    "Optimal use cases for each algorithm can be found here: [Comparing Forecast Algorithms](https://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-choosing-recipes.html#comparing-algos)\n",
    "\n",
    "In addition to multiple algorithms, **Amazon Forecast** offers three options for predictions:\n",
    "- **Manual Selection** - Manually select a single algorithm to apply to entire dataset\n",
    "- **AutoML** - Service finds and applies best-performing algorithm to entire dataset\n",
    "- **AutoPredictor** - Service runs all models and blends predictions with the goal of improving accuracy\n",
    "\n",
    "Manual selection and AutoML are considered Legacy models, and new features will only be supported by the AutoPredictor model. As a result, the AutoPredictor model will be used in this notebook. More information on **Amazon Forecast**'s AutoPredictor can be found here: [Amazon Forecast AutoPredictor](https://github.com/aws-samples/amazon-forecast-samples/blob/main/library/content/AutoPredictor.md)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4b3f28c4",
   "metadata": {},
   "source": [
    "### AutoPredictor"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a7a11559",
   "metadata": {},
   "source": [
    "This particular auto-predictor took approximately `6 hours 17 minutes` to train."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ef3c6c7e",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = forecast_client.create_auto_predictor(\n",
    "    PredictorName=auto_predictor_name,\n",
    "    ForecastHorizon=prediction_length,\n",
    "    ForecastFrequency=DATASET_FREQUENCY,\n",
    "    DataConfig={\"DatasetGroupArn\": dataset_group_arn},\n",
    ")\n",
    "\n",
    "auto_predictor_arn = response[\"PredictorArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86c1b356",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait(lambda: forecast_client.describe_auto_predictor(PredictorArn=auto_predictor_arn))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d07568f8",
   "metadata": {},
   "source": [
    "## Forecast\n",
    "### Create Forecast"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "962913d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = forecast_client.create_forecast(\n",
    "    ForecastName=forecast_name, PredictorArn=auto_predictor_arn\n",
    ")\n",
    "\n",
    "forecast_arn = response[\"ForecastArn\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d6b3d36d",
   "metadata": {},
   "outputs": [],
   "source": [
    "while True:\n",
    "    status = forecast_client.describe_forecast(ForecastArn=forecast_arn)[\"Status\"]\n",
    "    if status in (\"ACTIVE\", \"CREATE_FAILED\"):\n",
    "        break\n",
    "    time.sleep(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78889b9f",
   "metadata": {},
   "source": [
    "### Query Forecast"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "63bb963c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_comparison(item_id):\n",
    "\n",
    "    response = forecast_query_client.query_forecast(\n",
    "        ForecastArn=forecast_arn, Filters={\"item_id\": item_id}\n",
    "    )\n",
    "\n",
    "    def query_to_df(query):\n",
    "        predictions = query[\"Forecast\"][\"Predictions\"]\n",
    "        dfs = []\n",
    "        for quantile in predictions:\n",
    "            temp = pd.DataFrame.from_dict(predictions[quantile]).rename(\n",
    "                columns={\"Timestamp\": \"datetime\", \"Value\": quantile}\n",
    "            )\n",
    "            temp[\"datetime\"] = pd.to_datetime(temp[\"datetime\"]).dt.tz_localize(None)\n",
    "            dfs.append(temp)\n",
    "        return pd.concat(dfs, axis=1).T.drop_duplicates().T\n",
    "\n",
    "    query = query_to_df(response)\n",
    "\n",
    "    plt.figure(figsize=(18, 10))\n",
    "    plt.plot(query[\"datetime\"], query[\"p10\"], color=\"r\", lw=1)\n",
    "    plt.plot(query[\"datetime\"], query[\"p50\"], color=\"orange\", linestyle=\":\", lw=2)\n",
    "    plt.plot(query[\"datetime\"], query[\"p90\"], color=\"r\", lw=1)\n",
    "    plt.plot(\n",
    "        query[\"datetime\"], df_tts_test[df_tts_test[\"station\"] == item_id][\"TEMP\"], color=\"b\", lw=1\n",
    "    )\n",
    "    plt.fill_between(\n",
    "        query[\"datetime\"].tolist(),\n",
    "        query[\"p90\"].tolist(),\n",
    "        query[\"p10\"].tolist(),\n",
    "        color=\"y\",\n",
    "        alpha=0.5,\n",
    "    )\n",
    "\n",
    "    plt.title(item_id)\n",
    "    plt.xlabel(\"Datetime\")\n",
    "    plt.ylabel(\"Temperature (Celsius)\")\n",
    "\n",
    "    plt.legend([\"10% Quantile\", \"50% Quantile\", \"90% Quantile\", \"Target\"])\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3a26b5ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "stations = df_tts_test[\"station\"].unique()\n",
    "plot_comparison(stations[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b28f7add",
   "metadata": {},
   "source": [
    "![Forecast Results](images/forecast_results.png)\n",
    "\n",
    "The plot above shows the target values and 10%, 50%, and 90% quantiles. The 10% and 90% quantiles produce an 80% confidence interval. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6efb7a3d",
   "metadata": {},
   "source": [
    "## Resource Cleanup\n",
    "Let's clean up every resource we've created throughout this notebook. We'll have to delete our resources in a specific order:\n",
    "1. Delete Forecasts\n",
    "2. Delete Predictors\n",
    "3. Delete Dataset Imports\n",
    "4. Delete Datasets\n",
    "5. Delete Dataset Group\n",
    "6. Delete IAM Role"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14e59959",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait_till_delete(lambda: forecast_client.delete_forecast(ForecastArn=forecast_arn))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d708572e",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait_till_delete(lambda: forecast_client.delete_predictor(PredictorArn=auto_predictor_arn))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "25d114b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait_till_delete(\n",
    "    lambda: forecast_client.delete_dataset_import_job(\n",
    "        DatasetImportJobArn=tts_dataset_import_job_arn\n",
    "    )\n",
    ")\n",
    "util.wait_till_delete(\n",
    "    lambda: forecast_client.delete_dataset_import_job(\n",
    "        DatasetImportJobArn=rts_dataset_import_job_arn\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f973a09",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait_till_delete(lambda: forecast_client.delete_dataset(DatasetArn=tts_dataset_arn))\n",
    "util.wait_till_delete(lambda: forecast_client.delete_dataset(DatasetArn=rts_dataset_arn))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10c10701",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.wait_till_delete(\n",
    "    lambda: forecast_client.delete_dataset_group(DatasetGroupArn=dataset_group_arn)\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "817f4361",
   "metadata": {},
   "outputs": [],
   "source": [
    "util.delete_iam_role(role_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f10a3545",
   "metadata": {},
   "source": [
    "## Next Steps"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "889f4e68",
   "metadata": {},
   "source": [
    "This notebook illustrates the features offered by **Amazon Forecast**, and is part of the [Time Series Modeling with Amazon Forecast and DeepAR on SageMaker](.) series. The notebook series aims to demonstrate how to use the **Amazon Forecast** and **DeepAR on SageMaker** time series modeling services as well as outline their features. Be sure to read the [DeepAR on SageMaker](./deepar_example.ipynb) example, and view a top-level comparison of both services in the [README](./README.md)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Notebook CI Test Results\n",
    "\n",
    "This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.\n",
    "\n",
    "![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n",
    "\n",
    "![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/introduction_to_amazon_algorithms|forecasting_services_comparison|forecast_example.ipynb)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_python3",
   "language": "python",
   "name": "conda_python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}